    Parallel SHVC decoder: Implementation and analysis

    The new Scalable High efficiency Video Coding (SHVC) standard is based on a multi-loop coding structure, which requires fully decoding all intermediate layers. Decoding complexity therefore becomes a real issue, especially for real-time decoding of ultra-high video resolutions. A parallel processing architecture is proposed to reduce both the decoding time and the latency of the SHVC decoder. The proposed solution combines the high-level parallel processing solutions defined in the HEVC standard with an extension of frame-based parallelism. The latter enables several spatial and temporal SHVC frames to be decoded in parallel, improving both decoding frame rate and latency. The wavefront parallel processing solution is used at a coarser level of granularity. The proposed hybrid parallel processing approach achieves a near-optimal speedup and provides a good trade-off between decoding time, latency and memory usage. On a 6-core Xeon processor, the parallel SHVC decoder achieves real-time decoding of 1600p60 video.
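
    As a rough illustration of the wavefront parallel processing mentioned above, the sketch below computes the earliest step at which each coding tree unit (CTU) can start. It assumes the usual HEVC wavefront dependency (a CTU waits for its left neighbour and for the top-right CTU of the row above), unit decode time per CTU, and one thread per row. The function names are hypothetical and not taken from the paper's decoder.

```python
def wavefront_schedule(rows, cols):
    """Earliest start step of each CTU under wavefront dependencies.

    Assumes unit decode time per CTU and one thread per CTU row, so each
    row trails the row above it by two CTUs.
    """
    return [[c + 2 * r for c in range(cols)] for r in range(rows)]


def max_parallel_ctus(rows, cols):
    """Largest number of CTUs that share the same start step."""
    counts = {}
    for row in wavefront_schedule(rows, cols):
        for step in row:
            counts[step] = counts.get(step, 0) + 1
    return max(counts.values())


if __name__ == "__main__":
    for row in wavefront_schedule(3, 4):
        print(row)
    print("peak parallelism:", max_parallel_ctus(3, 4))
```

    Under these assumptions a 3x4 CTU grid finishes in 8 steps instead of 12 sequential ones, which hints at why wavefront processing alone does not scale as well as combining it with frame-level parallelism.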

    ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression

    Over the last few years, neural image compression has gained wide attention from research and industry, yielding promising end-to-end deep neural codecs that outperform their conventional counterparts in rate-distortion performance. Despite significant advances, current methods, including attention-based transform coding, still leave room to reduce the coding rate while preserving reconstruction fidelity, especially in non-homogeneous textured image areas. Those models also require more parameters and a higher decoding time. To tackle these challenges, we propose ConvNeXt-ChARM, an efficient ConvNeXt-based transform coding framework, paired with a compute-efficient channel-wise auto-regressive prior to capture both global and local contexts from the hyper and quantized latent representations. The proposed architecture can be optimized end-to-end to fully exploit the context information and extract a compact latent representation while reconstructing higher-quality images. Experimental results on four widely used datasets show that ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions, estimated on average at 5.24% and 1.22% over the versatile video coding (VVC) reference encoder (VTM-18.0) and the state-of-the-art learned image compression method SwinT-ChARM, respectively. Moreover, we provide model scaling studies to verify the computational efficiency of our approach and conduct several objective and subjective analyses to highlight the performance gap between the next-generation ConvNet, namely ConvNeXt, and Swin Transformer. Comment: arXiv admin note: substantial text overlap with arXiv:2307.02273. text overlap with arXiv:2307.0609

    Multi-core software architecture for the scalable HEVC decoder

    The scalable high efficiency video coding (SHVC) standard aims to provide temporal, spatial and quality scalability. In this paper we investigate a pipelined and parallel software architecture for the SHVC decoder. The proposed architecture is based on the OpenHEVC software, which implements the high efficiency video coding (HEVC) decoder. The architecture of the SHVC decoder enables two levels of parallelism. The first level decodes the base layer and the enhancement layers in parallel. The second level parallelizes the decoding within both the base layer and the enhancement layers through the HEVC high-level parallel processing solutions, including tiles and wavefront. To the best of our knowledge, it is the first real-time parallel software implementation of the SHVC decoder. On an Intel Xeon processor running at 3.2 GHz, the SHVC decoder decodes a 1600p enhancement layer at 40 fps for x1.5 spatial scalability using six concurrent threads.

    4K real time video streaming with SHVC decoder and GPAC player

    This paper presents the first 4Kp30 end-to-end video streaming demonstration based on the upcoming Scalable High efficiency Video Coding (SHVC) standard. The optimized and parallel SHVC decoder is used within the GPAC player to decode and display the received SHVC layers in real time. The SHVC reference software model (SHM) is used to encode the original 4K video in two spatial scalability layers: the base layer at 1080p resolution and the enhancement layer at 2160p resolution. The SHVC bitstream is encapsulated into the MP4 file format with the GPAC multimedia library. The GPAC player at the server side broadcasts the MP4 content in MPEG-2 TS. At the client side, the GPAC player receives the SHVC video packets, which are decoded by the SHVC decoder and then rendered in real time by the player. The GPAC player provides an interactive interface for switching between displaying the base and the enhancement layers.

    Ensemble Learning for Efficient VVC Bitrate Ladder Prediction

    Changing the encoding parameters, in particular the video resolution, is a common practice before transcoding. To this end, streaming and broadcast platforms rely on so-called bitrate ladders to determine the optimal resolution for given bitrates. However, determining the bitrate ladder is usually challenging: on one hand, so-called fit-for-all static ladders waste bandwidth, and on the other hand, fully specialized ladders are often not affordable in terms of computational complexity. In this paper, we propose an ML-based scheme for predicting the bitrate ladder based on the content of the video. The baseline of our solution predicts the bitrate ladder using two constituent methods, which require no encoding passes. To further enhance the performance of the constituent methods, we integrate a conditional ensemble method to aggregate their decisions, at the cost of a very small number of encoding passes. The experiment, carried out on VVenC, an optimized software encoder implementation of the VVC standard, shows significant performance improvement. Compared to a static bitrate ladder, the proposed method offers about 13% bitrate reduction in terms of BD-BR with negligible additional computational overhead. Conversely, compared to the fully specialized bitrate ladder method, the proposed method offers about 86% to 92% complexity reduction, at the cost of only a 0.8% to 0.9% coding efficiency drop in terms of BD-BR.

    LAR Image transmission over fading channels: a hierarchical protection solution

    The aim of this paper is to present an efficient scheme to transmit a compressed digital image over a non-frequency-selective Rayleigh fading channel. The proposed scheme is based on the Locally Adaptive Resolution (LAR) algorithm, and the Reed-Solomon error-correcting code is used to protect the data against channel errors. To optimize the protection rate and ensure better protection, we introduce an Unequal Error Protection (UEP) strategy that takes the hierarchy of the information into account. The digital communication system also includes appropriate interleaving and differential modulation. Simulation results show that our scheme is an efficient solution for image transmission over wireless channels and provides a high quality of service, outperforming the JPWL scheme in high bit error rate conditions.

    Bitrate Ladder Prediction Methods for Adaptive Video Streaming: A Review and Benchmark

    HTTP adaptive streaming (HAS) has emerged as a widely adopted approach for over-the-top (OTT) video streaming services, due to its ability to deliver a seamless streaming experience. A key component of HAS is the bitrate ladder, which provides the encoding parameters (e.g., bitrate-resolution pairs) used to encode the source video. The representations in the bitrate ladder allow the client's player to dynamically adjust the quality of the video stream based on network conditions by selecting the most appropriate representation. The most straightforward and lowest-complexity approach uses a fixed bitrate ladder for all videos, consisting of pre-determined bitrate-resolution pairs, known as one-size-fits-all. Conversely, the most reliable technique relies on intensively encoding all resolutions over a wide range of bitrates to build the convex hull, thereby optimizing the bitrate ladder for each specific video. Several techniques have been proposed to predict content-based ladders without performing a costly exhaustive search encoding. This paper provides a comprehensive review of various methods, including both conventional and learning-based approaches. Furthermore, we conduct a benchmark study focusing exclusively on learning-based approaches for predicting content-optimized bitrate ladders across multiple codec settings. The considered methods are evaluated on our proposed large-scale dataset, which includes 300 UHD video shots encoded with software and hardware encoders using three state-of-the-art codecs, AVC/H.264, HEVC/H.265, and VVC/H.266, at various bitrate points. Our analysis provides baseline methods and insights that will be valuable for future research in the field of bitrate ladder prediction. The source code of the proposed benchmark and the dataset will be made publicly available upon acceptance of the paper.
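
    To make the exhaustive per-title approach described above concrete, here is a minimal sketch that selects a ladder from a set of trial encodes. It uses a Pareto-front simplification rather than a true convex hull: an encode is kept only if it beats the quality of every cheaper encode. The `Encode` type, the function names and the sample numbers are all made up for illustration and are not taken from the benchmark.

```python
from typing import NamedTuple


class Encode(NamedTuple):
    height: int          # vertical resolution in pixels
    bitrate_kbps: float  # measured bitrate of the trial encode
    quality: float       # any higher-is-better score, e.g. VMAF or PSNR


def pareto_front(encodes):
    """Keep encodes whose quality exceeds that of every cheaper-or-equal encode."""
    front, best = [], float("-inf")
    for e in sorted(encodes, key=lambda e: (e.bitrate_kbps, -e.quality)):
        if e.quality > best:
            front.append(e)
            best = e.quality
    return front


def ladder(encodes, target_bitrates):
    """For each target bitrate, pick the best front point that fits within it."""
    front = pareto_front(encodes)
    result = {}
    for target in target_bitrates:
        fitting = [e for e in front if e.bitrate_kbps <= target]
        result[target] = max(fitting, key=lambda e: e.quality) if fitting else None
    return result


if __name__ == "__main__":
    # Hypothetical trial encodes for one video.
    trials = [
        Encode(540, 1000, 70), Encode(1080, 1000, 60),
        Encode(540, 3000, 80), Encode(1080, 3000, 88), Encode(2160, 3000, 75),
        Encode(1080, 8000, 92), Encode(2160, 8000, 95),
    ]
    for target, choice in ladder(trials, [1000, 3000, 8000]).items():
        print(target, choice)
```

    The cost of this approach lies in producing the trial encodes, not in the selection step, which is why the prediction methods reviewed here try to avoid encoding every resolution at every bitrate.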

    Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders

    The next-generation Versatile Video Coding (VVC) standard introduces a new Multi-Type Tree (MTT) block partitioning structure that supports Binary-Tree (BT) and Ternary-Tree (TT) splits in both vertical and horizontal directions. This new approach leads to five possible splits at each block depth and thereby improves the coding efficiency of VVC over that of the preceding High Efficiency Video Coding (HEVC) standard, which only supports Quad-Tree (QT) partitioning with a single split per block depth. However, MTT has also considerably increased encoder computational complexity. In this paper, a two-stage learning-based technique is proposed to tackle the complexity overhead of MTT in VVC intra encoders. In our scheme, the input block is first processed by a Convolutional Neural Network (CNN) to predict its spatial features through a vector of probabilities describing the partition at each 4x4 edge. Subsequently, a Decision Tree (DT) model leverages this vector of spatial features to predict the most likely splits at each block. Finally, based on this prediction, only the N most likely splits are processed by the Rate-Distortion (RD) process of the encoder. In order to train our CNN and DT models on a wide range of image contents, we also propose a public VVC frame partitioning dataset based on an existing image dataset encoded with the VVC reference software encoder. Our proposal relying on the top-3 configuration reaches 46.6% complexity reduction for a negligible bitrate increase of 0.86%. A top-2 configuration enables a higher complexity reduction of 69.8% for 2.57% bitrate loss. These results show a better trade-off between VTM intra coding efficiency and complexity reduction compared to state-of-the-art solutions.
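
    The final pruning step described above (keeping only the N most likely splits for rate-distortion evaluation) can be sketched as follows. This is only an illustration of the top-N idea: the split labels, the function name and the probability values are assumptions, and the CNN and decision-tree stages that would produce the probabilities are not modelled.

```python
# Hypothetical labels for the five split types named in the abstract.
SPLITS = ("QT", "BT_H", "BT_V", "TT_H", "TT_V")


def top_n_splits(probabilities, n):
    """Return the n split types with the highest predicted probability."""
    if len(probabilities) != len(SPLITS):
        raise ValueError("expected one probability per split type")
    ranked = sorted(zip(SPLITS, probabilities), key=lambda sp: sp[1], reverse=True)
    return [name for name, _ in ranked[:n]]


if __name__ == "__main__":
    predicted = [0.10, 0.40, 0.05, 0.30, 0.15]  # made-up model output
    # Only these candidates would be passed on to the RD search.
    print(top_n_splits(predicted, 3))
```

    A smaller N skips more rate-distortion checks, which matches the trade-off reported between the top-3 and top-2 configurations.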

    H2B2VS (HEVC Hybrid Broadcast Broadband Video Services) – building innovative solutions over hybrid networks

    Broadcast and broadband networks continue to be separate worlds in the video consumption business. Some initiatives such as HbbTV have built a bridge between both worlds, but their application is largely limited to providing links over the broadcast channel to content providers’ applications such as catch-up TV services. In practice, the user is using either one network or the other. H2B2VS is a Celtic-Plus project that aims to exploit the potential of truly hybrid networks by implementing efficient synchronization mechanisms and using new video coding standards such as High Efficiency Video Coding (HEVC). The goal is to develop successful hybrid network solutions that enable value-added services with optimum bandwidth usage in each network and with clear commercial applications. An example of the potential of this approach is the transmission of Ultra-HD TV by sending the main content over the broadcast channel and the required complementary information over the broadband network. This technology can also be used to improve accessibility: deaf viewers receive, through the broadband network, a sign language translation of a programme sent over the broadcast channel, and the TV set then displays this translation in an inset window. One of the most important contributions of the project is developing and testing synchronization methods between two different networks that offer unequal qualities of service with significant differences in delay and jitter. In this paper, the main technological contributions of the project are described, including SHVC, the scalable extension of HEVC, with a special focus on the synchronization solution adopted by MPEG and DVB. The paper also presents some of the implemented practical use cases, such as the sign language translation described above, and their performance results, so as to evaluate the commercial application of this type of solution.